OCR exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

A New Editing Scheme Based on a Fast Two-String Median Computation Applied to OCR

Internal identifier: 000759 (Main/Exploration); previous: 000758; next: 000760

Authors: Ignacio Abreu Salas [Cuba]; Ramón Rico-Juan [Spain]

RBID : ISTEX:492AFA211AC0CD68A47DF8F47C37AC9C5B0ED5F4

Abstract

This paper presents a new fast algorithm to compute an approximation to the median of two character strings, each representing a 2D shape, and applies it in a new classification scheme to reduce the error rate. The median string is obtained by applying certain edit operations from the minimum-cost edit sequence to one of the original strings. The new dataset editing scheme relaxes the deletion criterion of the Wilson Editing Procedure: in practice, not every instance misclassified by its nearest neighbors is pruned. Instead, an artificial instance is added to the dataset in the expectation that the instance will be classified correctly in the future. The new artificial instance is the median of the misclassified sample and its same-class nearest neighbor. Experiments on two widely used datasets of handwritten characters show that this preprocessing scheme can reduce the classification error in about 78% of trials.
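
The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions the abstract does not confirm: unit-cost Levenshtein operations, a median built by applying the first half of the minimum-cost edit script to the first string, a leave-one-out 1-nearest-neighbor test, and an editing pass that never prunes and only adds the artificial instance. The names `edit_script`, `distance`, `approx_median` and `relaxed_editing` are hypothetical.

```python
from typing import List, Tuple

Op = Tuple[str, int, str]  # (kind, index in the source string, character)


def edit_script(a: str, b: str) -> List[Op]:
    """Minimum-cost edit script (unit costs, an assumption) turning a into b."""
    n, m = len(a), len(b)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i
    for j in range(m + 1):
        d[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = d[i - 1][j - 1] + (a[i - 1] != b[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    ops: List[Op] = []
    i, j = n, m
    while i > 0 or j > 0:  # backtrace from the bottom-right cell
        if i > 0 and j > 0 and d[i][j] == d[i - 1][j - 1] + (a[i - 1] != b[j - 1]):
            if a[i - 1] != b[j - 1]:
                ops.append(("sub", i - 1, b[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and d[i][j] == d[i - 1][j] + 1:
            ops.append(("del", i - 1, ""))
            i -= 1
        else:
            ops.append(("ins", i, b[j - 1]))  # insert before a[i]
            j -= 1
    ops.reverse()  # left-to-right order; matches are omitted
    return ops


def distance(a: str, b: str) -> int:
    """Edit distance, taken as the length of the minimum-cost script."""
    return len(edit_script(a, b))


def approx_median(a: str, b: str) -> str:
    """Apply the first half of the edit script to a (assumed selection rule)."""
    ops = edit_script(a, b)
    chosen = ops[: len(ops) // 2]
    s = list(a)
    # Apply right to left so indices into the original string stay valid.
    for kind, pos, ch in reversed(chosen):
        if kind == "sub":
            s[pos] = ch
        elif kind == "del":
            del s[pos]
        else:
            s.insert(pos, ch)
    return "".join(s)


def relaxed_editing(data: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Keep every instance; for each one misclassified by its nearest
    neighbor, add the median of it and its same-class nearest neighbor."""
    edited = list(data)
    for idx, (x, label) in enumerate(data):
        others = [p for k, p in enumerate(data) if k != idx]
        if not others:
            continue
        nearest = min(others, key=lambda p: distance(x, p[0]))
        if nearest[1] == label:
            continue  # correctly classified, nothing to add
        same = [p for p in others if p[1] == label]
        if same:
            mate = min(same, key=lambda p: distance(x, p[0]))
            edited.append((approx_median(x, mate[0]), label))
    return edited
```

Because the chosen operations form a prefix of an optimal script, the resulting string lies on a shortest edit path between the two inputs, so its distances to both inputs sum to the distance between them. The paper's actual selection of operations and its pruning rule may differ from this sketch.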

DOI: 10.1007/978-3-642-14980-1_74



The document in XML format

<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">A New Editing Scheme Based on a Fast Two-String Median Computation Applied to OCR</title>
<author>
<name sortKey="Abreu Salas, Ignacio" sort="Abreu Salas, Ignacio" uniqKey="Abreu Salas I" first="Ignacio" last="Abreu Salas">Ignacio Abreu Salas</name>
</author>
<author>
<name sortKey="Rico Juan, Ram N" sort="Rico Juan, Ram N" uniqKey="Rico Juan R" first="Ram N" last="Rico-Juan">Ram N Rico-Juan</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:492AFA211AC0CD68A47DF8F47C37AC9C5B0ED5F4</idno>
<date when="2010" year="2010">2010</date>
<idno type="doi">10.1007/978-3-642-14980-1_74</idno>
<idno type="url">https://api.istex.fr/document/492AFA211AC0CD68A47DF8F47C37AC9C5B0ED5F4/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">002048</idno>
<idno type="wicri:Area/Istex/Curation">001F10</idno>
<idno type="wicri:Area/Istex/Checkpoint">000339</idno>
<idno type="wicri:doubleKey">0302-9743:2010:Abreu Salas I:a:new:editing</idno>
<idno type="wicri:Area/Main/Merge">000764</idno>
<idno type="wicri:Area/Main/Curation">000759</idno>
<idno type="wicri:Area/Main/Exploration">000759</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title level="a" type="main" xml:lang="en">A New Editing Scheme Based on a Fast Two-String Median Computation Applied to OCR</title>
<author>
<name sortKey="Abreu Salas, Ignacio" sort="Abreu Salas, Ignacio" uniqKey="Abreu Salas I" first="Ignacio" last="Abreu Salas">Ignacio Abreu Salas</name>
<affiliation wicri:level="1">
<country xml:lang="fr">Cuba</country>
<wicri:regionArea>Universidad de Matanzas</wicri:regionArea>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">Cuba</country>
</affiliation>
</author>
<author>
<name sortKey="Rico Juan, Ram N" sort="Rico Juan, Ram N" uniqKey="Rico Juan R" first="Ram N" last="Rico-Juan">Ram N Rico-Juan</name>
<affiliation wicri:level="4">
<country xml:lang="fr">Espagne</country>
<wicri:regionArea>Dpto Lenguajes y Sistemas Informáticos, Universidad de Alicante</wicri:regionArea>
<placeName>
<settlement type="city">Alicante</settlement>
<region nuts="2" type="region">Communauté valencienne</region>
</placeName>
<orgName type="university">Université d'Alicante</orgName>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">Espagne</country>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series>
<title level="s">Lecture Notes in Computer Science</title>
<imprint>
<date>2010</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
</series>
<idno type="istex">492AFA211AC0CD68A47DF8F47C37AC9C5B0ED5F4</idno>
<idno type="DOI">10.1007/978-3-642-14980-1_74</idno>
<idno type="ChapterID">74</idno>
<idno type="ChapterID">Chap74</idno>
</biblStruct>
</sourceDesc>
<seriesStmt>
<idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass></textClass>
<langUsage>
<language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Abstract: This paper presents a new fast algorithm to compute an approximation to the median of two character strings, each representing a 2D shape, and applies it in a new classification scheme to reduce the error rate. The median string is obtained by applying certain edit operations from the minimum-cost edit sequence to one of the original strings. The new dataset editing scheme relaxes the deletion criterion of the Wilson Editing Procedure: in practice, not every instance misclassified by its nearest neighbors is pruned. Instead, an artificial instance is added to the dataset in the expectation that the instance will be classified correctly in the future. The new artificial instance is the median of the misclassified sample and its same-class nearest neighbor. Experiments on two widely used datasets of handwritten characters show that this preprocessing scheme can reduce the classification error in about 78% of trials.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>Cuba</li>
<li>Espagne</li>
</country>
<region>
<li>Communauté valencienne</li>
</region>
<settlement>
<li>Alicante</li>
</settlement>
<orgName>
<li>Université d'Alicante</li>
</orgName>
</list>
<tree>
<country name="Cuba">
<noRegion>
<name sortKey="Abreu Salas, Ignacio" sort="Abreu Salas, Ignacio" uniqKey="Abreu Salas I" first="Ignacio" last="Abreu Salas">Ignacio Abreu Salas</name>
</noRegion>
<name sortKey="Abreu Salas, Ignacio" sort="Abreu Salas, Ignacio" uniqKey="Abreu Salas I" first="Ignacio" last="Abreu Salas">Ignacio Abreu Salas</name>
</country>
<country name="Espagne">
<region name="Communauté valencienne">
<name sortKey="Rico Juan, Ram N" sort="Rico Juan, Ram N" uniqKey="Rico Juan R" first="Ram N" last="Rico-Juan">Ram N Rico-Juan</name>
</region>
<name sortKey="Rico Juan, Ram N" sort="Rico Juan, Ram N" uniqKey="Rico Juan R" first="Ram N" last="Rico-Juan">Ram N Rico-Juan</name>
</country>
</tree>
</affiliations>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000759 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000759 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     ISTEX:492AFA211AC0CD68A47DF8F47C37AC9C5B0ED5F4
   |texte=   A New Editing Scheme Based on a Fast Two-String Median Computation Applied to OCR
}}

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024